Abstract

Few attempts have been made to describe the statistical features and historical evolution
of the bipartite country-product export matrix. An important step in this direction is the
introduction of a products network, namely a hierarchical forest of products that models the
formation and the evolution of commodities. In the present article, we propose a simple dynamical
model in which countries compete with each other to acquire the ability to produce and export new
products. Countries have two ways to expand their exports: innovating, i.e. introducing new goods,
namely new nodes in the products network, or copying the productive process of others, i.e.
occupying a node already present in the same network. In this way, the topology of the products
network and the country-product matrix evolve simultaneously, driven by the countries’ push toward
innovation.


Introduction 

In economic growth, the different endogenous and exogenous functional requirements that let a
firm pursue products involve the transformation and combination of tangible and intangible attributes
[1], such as the bureaucratic environment [2], infrastructure [3], education [4], etc. All these
features drive either the technological improvement of the firm’s production chain [5], or firm
diversification within a country [6], or the introduction of new products. Current models of economic
growth consider the relation between the inputs of a country’s goods production and their effects on
overall productivity, without taking into consideration a measure of the diversity of the inputs.


Economic Complexity is a new and expanding field in economic analysis, which provides a framework
to measure the competitiveness of countries and the complexity of products from the national export
baskets. The central object of study of this approach is the binary export matrix M, obtained by
imposing a threshold on the Revealed Comparative Advantage (RCA) computed from the country-product
trade volumes matrix. The matrix M can be thought of as the biadjacency matrix of a bipartite network
in which one layer is represented by countries and the other by exported products.

In order to quantify the competitiveness of countries from the hidden information in M, a new metric
for countries and products has been proposed, overcoming flaws and problems of the seminal work.
The basic idea is to define a non-linear map through an iterative process which couples
the Fitness of countries to the Complexity of products. At every step of the iteration, the Fitness
$F_c$ of a given country c is proportional to the sum of the exported products, weighted by their
complexity parameter $Q_p$. On the other hand, the complexity $Q_p$ of a product p is non-linearly
related to the fitness of its exporters, so that products exported by low-fitness countries have a
low level of complexity and high-complexity products are exported by high-fitness countries only.
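The coupled map described above can be illustrated with a short sketch. It assumes the commonly used form of the Fitness-Complexity iteration, in which the fitness of a country is the sum of the complexities of its exports, the complexity of a product is the inverse of the summed inverse fitnesses of its exporters, and both vectors are rescaled to unit mean at every step. The function name `fitness_complexity`, the fixed number of iterations and the list-of-lists layout are illustrative assumptions, and every country and product is assumed to have at least one link.

```python
def fitness_complexity(M, n_iter=50):
    """Iterate the Fitness-Complexity map on a binary matrix M[c][p].

    Rows are countries, columns are products. Returns (F, Q).
    Assumption: no empty rows or columns in M.
    """
    n_c, n_p = len(M), len(M[0])
    F = [1.0] * n_c
    Q = [1.0] * n_p
    for _ in range(n_iter):
        # Fitness: sum of the complexities of the exported products.
        F_new = [sum(Q[p] for p in range(n_p) if M[c][p]) for c in range(n_c)]
        # Complexity: dominated by the least fit exporters of the product.
        Q_new = []
        for p in range(n_p):
            inv = sum(1.0 / F[c] for c in range(n_c) if M[c][p])
            Q_new.append(1.0 / inv if inv > 0 else 0.0)
        # Rescale both vectors to unit mean at every step.
        mean_F = sum(F_new) / n_c
        mean_Q = sum(Q_new) / n_p
        F = [f / mean_F for f in F_new]
        Q = [q / mean_Q for q in Q_new]
    return F, Q
```

On a perfectly triangular matrix, the most diversified country obtains the highest fitness and the least ubiquitous product the highest complexity, which is the qualitative behaviour described in the text.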


The historical evolution of M shows the development paths followed by the different countries in terms
of their export flows. It is possible to build a taxonomy network for products directly from the time
evolution of the export baskets of countries. In this way the development pattern followed by
different countries can be predicted as a dynamics on an evolving products network.


In this paper we present a dynamical model that describes the evolution of the export baskets of countries
by implementing a minimal network model of product innovation processes, which is able to reproduce
with good accuracy the main features of the observed evolution of M. The keystone of our model
is the existence of an evolving hierarchical products network in which each country occupies a subset
of nodes; within this framework, product innovation is represented by the introduction of new
nodes in the products network. Borrowing the definition from [24], we distinguish between “novelty”
and “innovation”: an “innovation” is something that is new for the whole community, while a “novelty”
is something already known, which is new just for an individual. In this way, a novelty can be “copied”
from near neighbours, while the process that leads to an innovation depends just on the single individual.
There are three main factors that drive the evolution of the export baskets of countries and the
innovation dynamics of products in our model:

1) the country’s ability to diversify its basket;

2) the competition within a similar sector of products;

3) the ability to produce innovation, as opposed to simple technological updating by adopting
technology already developed by other countries.

The update of the export basket of a country can take place in two ways: i) as an imitation process
from other countries, introducing a novelty; ii) as the development of a brand new product,
introducing an innovation. The technological updating is equivalent to the introduction of
novelties described in [24].

Our model makes the products network and the matrix M evolve simultaneously, each conditioning
the other: the country and the product that will evolve are chosen on the basis of the matrix M at
that time, but the kind of evolution is decided on the basis of the products network. When a country
develops a new product to export, following its path on the products network, it alters the
matrix M by modifying the products network. In this way, the efforts made by countries to develop new
technologies modify in real time the paths that other countries can take to diversify their own export
baskets.

The paper is organized as follows. In the section “Methods” we first introduce the ingredients of our
model, such as the data set examined (for the comparison of our model with real data) and the network
of countries and products; then we illustrate our algorithm in detail in the subsection “The Model”. In
“Results” we analyse our results, which are further commented on in the section “Discussion”.


Methods 

Dataset 

The dataset on which we test our model is the UN-NBER SITC Rev. 2 dataset. From the imports
registered by the UN, the exports of the World Trade Web (WTW) are reconstructed for approximately 2577
product categories over the years 1963-2000. After a data cleaning procedure to fix some
inconsistencies, the number of products in the analysed time interval was fixed to 538, while the
number of countries oscillates between 130 and 151, due to geopolitical changes.
The country-product network

Economic Complexity [11, 12, 13, 14, 15, 16, 17, 18, 19] focuses on the analysis of the bipartite
network of countries and exported products. We start from the export volumes matrix q: every entry
$q_{cp}$ represents the total amount of exports in USD of the product p by the country c. In order to
binarize q, the RCA (Revealed Comparative Advantage, [20]) is calculated:


\[
\mathrm{RCA}_{cp} \overset{\text{def}}{=}
\frac{q_{cp} \big/ \sum_{p'} q_{cp'}}{\sum_{c'} q_{c'p} \big/ \sum_{c',p'} q_{c'p'}}
\qquad (1)
\]


The philosophy at the basis of Eq. (1) is to give a non-dimensional measure of how the export basket of
a specific country is organized with respect to the average, comparing the weight of the product p in
the export basket of c with the weight of p in the global export basket. In the light of that, we can
impose a threshold on the RCA matrix, obtaining the binary M-matrix: if $m_{cp}$ is the entry of M
relative to the country c and the product p, then


\[
m_{cp} =
\begin{cases}
1 & \text{if } \mathrm{RCA}_{cp} \geq 1 \\
0 & \text{if } \mathrm{RCA}_{cp} < 1
\end{cases}
\]

i.e. only exported products exceeding the RCA threshold appear in the basket of a country. An export
basket in which just raw materials are over the RCA threshold (and so appear in the matrix M)
denotes limited industrialization, while a diversified one, ranging from highly exclusive products to
the simplest ones, implies a completed industrialization. The matrix M can be thought of as the
biadjacency matrix of a bipartite network, in which the first layer, corresponding to the row index c,
is composed of countries, while the second layer, corresponding to the column index p, is composed of
products. Links are permitted only between nodes of different layers.
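The construction of M from the export volumes can be sketched in a few lines. The snippet below is only an illustration with hypothetical function names (`rca_matrix`, `binarize`); it uses plain Python lists to stay dependency-free, and it assumes that entries exactly at the threshold are counted as exports.

```python
def rca_matrix(q):
    """Revealed Comparative Advantage of an export-volume matrix q[c][p]."""
    n_c, n_p = len(q), len(q[0])
    country_tot = [sum(row) for row in q]                # total exports of c
    product_tot = [sum(q[c][p] for c in range(n_c)) for p in range(n_p)]
    world_tot = sum(country_tot)                         # total world exports
    rca = [[0.0] * n_p for _ in range(n_c)]
    for c in range(n_c):
        for p in range(n_p):
            if country_tot[c] > 0 and product_tot[p] > 0:
                share_in_country = q[c][p] / country_tot[c]
                share_in_world = product_tot[p] / world_tot
                rca[c][p] = share_in_country / share_in_world
    return rca


def binarize(rca, threshold=1.0):
    """Binary matrix M: 1 where the RCA reaches the threshold (assumed >=)."""
    return [[1 if x >= threshold else 0 for x in row] for row in rca]


# Toy example: two countries, two products (volumes in arbitrary units).
q = [[10, 0], [10, 20]]
M = binarize(rca_matrix(q))
```

In the toy example the first country exports only the first product, so that product dominates its basket far more than it dominates world trade, and only that entry passes the threshold in the first row.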


Traditionally, the degrees of the nodes, i.e. the number of links per node, are called diversification
for countries and ubiquity for products; in terms of M they can be respectively expressed as

\[
k_c \overset{\text{def}}{=} \sum_{p} m_{cp}, \qquad
k_p \overset{\text{def}}{=} \sum_{c} m_{cp}. \qquad (2)
\]
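As a minimal illustration of the two degree definitions, the row and column sums of a binary matrix can be computed as follows; the function names are hypothetical and M is assumed to be a list of rows, one per country.

```python
def diversification(M):
    """k_c: number of products exported by each country (row sums)."""
    return [sum(row) for row in M]


def ubiquity(M):
    """k_p: number of countries exporting each product (column sums)."""
    return [sum(col) for col in zip(*M)]


# Toy triangular matrix: 3 countries (rows) and 3 products (columns).
M = [[1, 1, 1],
     [1, 1, 0],
     [1, 0, 0]]
k_c = diversification(M)
k_p = ubiquity(M)
```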


The model 

Our model focuses on the historical evolution of the M-matrix, i.e. the biadjacency matrix of the
undirected binary bipartite network of countries and exported products. The evolution of M is driven
by the evolution of the products network [12, 38], i.e. a hierarchical network based on the production
processes, such that two different products are linked if it is possible to pass from one to the
other by a technological improvement. The products network takes the topology of a forest in which the
“roots” represent the ancestor products, like raw materials, while the outermost leaves are the highest
technology goods.

The tree-like topology may appear to be a great simplification, in the sense that a certain production
could be affected even by a “distant” technological improvement. Nevertheless, the proposed topology
has been shown to be a reliable tool, able to capture the main features of the evolution of countries’
productivity; it is indeed remarkable that such a simple structure can correctly reproduce the
evolution of countries’ diversification.

A pictorial representation of the products network can be found in the left part of the top panel of Fig.
1: the network nodes represent different products and the links are the technological relationships,
while the colored disks occupying a given node stand for the different countries (one color for each
country) able to export the given product.

We superimposed the information contained in the matrix M on the topology of the products network in
order to give a more immediate interpretation of the mechanism we propose for the evolution of the
export baskets.
Let us recall some definitions from [24]: in this inspiring paper focused on the nature of innovation,
the authors distinguish between “novelties” and “innovations”. A novelty is a tool, a webpage, a song or
any other item that is only relatively new: it is already present in the common knowledge, but has not
yet been experienced by the single agent (a person or, as in the present article, a country); in
contrast, an innovation is something that has never appeared in the set of known items, so it is new for
everyone.

In our model, a country develops its export basket by progressively occupying a subset of the products
network. More specifically, at each time step of the algorithm, a selected country either occupies a
node already occupied by other countries or creates a new node, sprouting in the products network next
to one of those already present in its export basket. Restricting the set of nodes that a country can
occupy at a given time step to the products closest to its own export basket is similar to Kauffman’s
conjecture [28] about the “adjacent possible” nature of evolution: Kauffman proposed that the
innovation process takes place only on the border of one’s own set of knowledge, as items close to the
border are the most likely to be investigated and introduced.

Our algorithm is implemented as the sequential iteration of three fundamental substeps at each time
step: the first one decides the country that will enlarge its basket; the second selects the product
that will drive the evolution on the products network; finally, the third decides the path of the
country basket evolution, either creating a brand new product (thus, innovating), or copying a product
by adopting technology already developed by other countries (thus, introducing a novelty).

In detail, the three sequential substeps of our model are the following:

1) The country selection: Diversification. At the first substep, a country c is selected with
probability

\[
P_1(c) \sim k_c^{\alpha}, \qquad (3)
\]

where $k_c$ is its diversification, as defined in Eq. (2), and $\alpha > 0$ is a parameter of the model.
Eq. (3) is normalized over all countries, so that a single country evolves at each
time step. This first substep is similar to the generalization of preferential attachment presented
in [29], with the difference that here we select a node on one layer of a bipartite network, while in
[29] a node was selected in a monopartite one.

Equation (3) says that countries with a diversified export basket have a higher probability of being
chosen: it implements the idea that diversification can be taken as a good proxy for the efforts a
country makes in order to evolve its export basket (a wider discussion is developed in [30]). The
selected country is the one performing the evolution of the matrix M in the next two substeps.

In the second panel of Fig. 1 the first substep is shown pictorially: among the three countries (red,
blue and green), represented by different disks in the upper layer of the bipartite network, the red
one is selected and its export basket in the products network is highlighted. Note that the
diversification $k_c$ is the number of nodes occupied by each country, so we have $k_{red} = 8$,
$k_{blue} = 5$ and $k_{green} = 16$.

2) The evolving product selection: Competition. Once the country c has been chosen, we have
to select a product already in its export basket, from which the country either moves towards an
existing neighbouring node it does not yet occupy or sprouts a brand new one. We select such a
product p with probability

\[
P_2(p|c) \sim k_p^{\beta}, \qquad (4)
\]

where $k_p$ is the ubiquity, as defined in Eq. (2), and $\beta > 0$ is the second parameter of the model.
Similarly to Eq. (3), Eq. (4) implements the generalization of the preferential attachment criterion
of [29], applied to the product layer of our bipartite network.

The idea is that the more producers a product has, the stronger the effort towards renovation, so that
the most ubiquitous products offer more possibilities of improving the export basket. Indeed, similar
behaviour has been shown by empirical evidence.

The third panel of Fig. 1 illustrates this second substep: from the commodities present in the
export basket of the “red” country, p = “4” is selected (while on the bipartite network $k_p$ is the
number of countries linked to p, in the products network it is represented by the number of differently
coloured disks occupying the selected node).

3) Target product selection: Innovation against Novelty. In the previous substeps, we have
selected the country c and the product p (in the pictorial representation of Fig. 1, the “red” country
and the product “4”) that perform the last evolution substep. This choice was based on the
properties of the matrix M at that time step, more precisely on the diversification and ubiquity
defined in Eq. (2). So far, no information from the topology of the products network has been used.

Let us now consider the position that the chosen p occupies in the products network. There are two
options: either introducing a new node in the products network, i.e. innovating, or evolving along
the links already present in the products network by introducing in the export basket a product
neighbouring p and already exported by other countries, i.e. introducing a novelty.

The probability of copying a novelty from others increases with the number of countries that
already export the given good. In this way we implement the idea that it is much easier to acquire
nearby technology. On the other hand, the innovation process of proposing brand new products
needs an extra parameter.

As possible novelties we consider all the first neighbours $p'$ of p in the products network which are
not already present in the basket of c. At the same time, let us call $p^*$ a possible brand new product
sprouting out of p. We call $\tilde{p}$ the generic element of the set obtained by the union of the
$p'$ and $p^*$. In this third substep we select a single element $\tilde{p}$ of this set with
probability

\[
P_3(\tilde{p}|c,p) \sim (k_{\tilde{p}} + k^0)^{\gamma}, \qquad (5)
\]

where $\gamma, k^0 > 0$ are the last two parameters of the model and $k_{\tilde{p}}$ is the ubiquity of
$\tilde{p}$; clearly $k_{p^*} = 0$. The quantity $k^0 > 0$ is a necessary offset that allows even the
artificial product $p^*$ to be selected. The $k_{\tilde{p}}$ term makes the probability of “copying” an
“accessible” product already made by other countries larger than that of introducing an innovation.

The fourth panel of Fig. 1 illustrates this third substep: in the third panel of the same figure we
selected the product p = “4”.

Now the choice is between selecting the product “8”, already produced by the “green” country, and
introducing a brand new product $p^*$ = “20”. The picture shows this last event.
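The three substeps above can be sketched as a single update function. This is a minimal illustration under stated assumptions, not a reference implementation: export baskets are stored as a dictionary `baskets` mapping each country to a set of integer product ids, the forest is stored as an undirected adjacency dictionary `neighbours`, new products receive the next free integer id, and the names `model_step` and `ubiquity` are hypothetical.

```python
import random


def ubiquity(baskets, product):
    """Number of countries whose basket contains `product` (k_p)."""
    return sum(1 for basket in baskets.values() if product in basket)


def model_step(baskets, neighbours, alpha, beta, gamma, k0, rng):
    """One time step; returns (country, source, target, innovated) or None."""
    # Substep 1 (diversification): pick a country with probability ~ k_c^alpha.
    countries = [c for c in baskets if baskets[c]]
    w1 = [len(baskets[c]) ** alpha for c in countries]
    country = rng.choices(countries, weights=w1, k=1)[0]
    basket = baskets[country]

    # Substep 2 (competition): pick a product of the basket with
    # probability ~ k_p^beta.
    owned = sorted(basket)
    w2 = [ubiquity(baskets, p) ** beta for p in owned]
    source = rng.choices(owned, weights=w2, k=1)[0]

    # Substep 3 (innovation against novelty): the candidates are the
    # neighbours of `source` missing from the basket, plus a brand new
    # product p* whose ubiquity is zero.
    p_star = len(neighbours)  # assumed id scheme: products are 0, 1, 2, ...
    candidates = sorted(neighbours[source] - basket) + [p_star]
    w3 = [(ubiquity(baskets, p) + k0) ** gamma for p in candidates]
    if sum(w3) == 0:
        return None  # k0 = 0 and nothing left to copy around `source`
    target = rng.choices(candidates, weights=w3, k=1)[0]

    innovated = target == p_star
    if innovated:
        # The new node sprouts from `source` in the products network.
        neighbours[p_star] = {source}
        neighbours[source].add(p_star)
    basket.add(target)
    return country, source, target, innovated
```

Every successful call adds exactly one link to the bipartite network; a call returns `None` only when the offset is zero and the selected product has no neighbour left to copy, which can happen in the saturation stage.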

In our algorithm we iterate these three substeps until the number of products in the network equals
that of the observed matrix we want to reproduce, i.e. 538 for the examined data. Summarizing, the
parameters of the model are $\alpha$, $\beta$, $\gamma$ and $k^0$.

The density saturation 

For all the simulated values of the parameters, once the number of products introduced in the network
reaches that of the observed bipartite network, the density of links is still smaller than
the observed one. Consequently, starting from this time step we set $k^0 = 0$, in order to prevent
the creation of new products and permit only the introduction of novelties, thereby increasing the
density of the bipartite network. We stop the iteration of the algorithm when the density of the
country-product bipartite network saturates to the observed value, approximately 0.13. Interestingly,
a similar behaviour is shown by real data, as illustrated in Fig. 2. In this figure we can see
that the density of the bipartite network increases from year to year up to 1975 and then saturates to
an approximately constant value. One possible explanation of this phenomenon is that the
product categories of SITC Rev. 2 were formalized in 1980, data for the years before 1980 were
converted to SITC Rev. 2 from SITC Rev. 1, and some data may have been lost in merging the
datasets.
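The two-stage stopping rule can be summarized by a small helper that decides which offset to use at the next step. The sketch below is illustrative only: the names `density` and `effective_k0` are hypothetical, and M is assumed to be a list of rows with one column per product currently in the network.

```python
def density(M):
    """Fraction of filled entries of the binary country-product matrix M."""
    return sum(sum(row) for row in M) / (len(M) * len(M[0]))


def effective_k0(M, k0, n_products_target, density_target):
    """Offset to use at the next step, or None when the run should stop."""
    n_products = len(M[0])
    if n_products < n_products_target:
        return k0    # growth stage: innovation is still allowed
    if density(M) < density_target:
        return 0.0   # saturation stage: only novelties can be introduced
    return None      # observed density reached: stop iterating
```

With the values quoted in the text, the call would read `effective_k0(M, 4.0, 538, 0.13)`.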

Initial Conditions 

In order to render the final results of our model less sensitive to the initial conditions, we compose
the initial export basket of each country through Bernoulli trials. We start with a given number
$N_{roots}$ of initial products, hereafter called roots. We assign each product of this set to each
country with a constant probability $P_0$, independently of the other countries. Consequently, each
country has a random number of initial products selected from the set of roots, whose probability
distribution is binomial with mean value $N_{roots} P_0$ and standard deviation
$\sqrt{N_{roots} P_0 (1 - P_0)}$. At this level the products network is a set of disconnected nodes,
represented by the roots; from the first step of the evolution dynamics the products network evolves
from this initial condition as a branching process in which the branching event is represented by
innovation.
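A possible way to draw such an initial condition is sketched below. The function name `initial_condition`, the integer labels for countries and roots, and the dictionary layout are assumptions made for illustration only.

```python
import random


def initial_condition(n_countries, n_roots, p0, rng):
    """Draw the initial baskets and the initial (edgeless) products network.

    Each root is assigned to each country independently with probability p0,
    so the basket size of a country is binomial(n_roots, p0).
    """
    baskets = {
        c: {r for r in range(n_roots) if rng.random() < p0}
        for c in range(n_countries)
    }
    # At this stage the forest has no links: every root is an isolated node.
    neighbours = {r: set() for r in range(n_roots)}
    return baskets, neighbours


rng = random.Random(42)
baskets, neighbours = initial_condition(150, 20, 0.3, rng)
mean_size = sum(len(b) for b in baskets.values()) / len(baskets)
```

With 20 roots and an assignment probability of 0.3, the expected basket size is 6, so `mean_size` should fluctuate around that value. A country may also start with an empty basket, in which case it has zero weight in Eq. (3).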

Exploring parameter space 

As explained in the Supplementary Material, in order to find the best range for the values of the
parameters characterizing both the initial condition, $(N_{roots}, P_0)$, and the evolutionary model,
$(\alpha, \beta, \gamma, k^0)$, we tested a wide range of the dynamical parameters for each choice of
the first pair, with $N_{roots}$ ranging from 10 to 40 and $P_0$ from 0.2 to 0.6. We found that the
performance of the algorithm depends only weakly on the precise choice of
$(\alpha, \beta, \gamma, k^0)$, for exponents $\alpha, \beta, \gamma$ around unity and for $k^0$ around
$k^0 = 4$, the minimal value observed for the product ubiquity $k_p$ in real data. The slightly best
performing set of values is found to be $N_{roots} = 20$, $P_0 = 0.3$, $\alpha = 1.55$, $\beta = 0.8$,
$\gamma = 0.3$, $k^0 = 4$.

Main Results 

We compare the matrices M generated by our algorithm for different values of the parameters
$(\alpha, \beta, \gamma, k^0)$ with the observed one, using several non-trivial quantities
characterizing the features of binary bipartite networks. In particular we use the distributions of
countries’ Fitnesses and products’ Complexities, as defined in [14, 15]; nestedness [32];
assortativity [33]; motifs for bipartite networks [34, 35].

Fitness and Complexity In spite of the simplicity of our model, there is a remarkably good
agreement between simulations and real data for Fitness/Complexity (for details about the
definition of Fitness and Complexity, see the Appendix). In particular, the shape of the scatter plot
of the ranking of countries’ fitnesses (products’ complexities) against countries’ diversifications
(products’ ubiquities) reproduces well the behaviour of the real data; the result is shown in
Fig. 3a) (Fig. 3b)). Our algorithm is able to reproduce the “shape” of the original
matrix data (blue dots) within the 95% probability band, which is a remarkable result, since these
peculiar trends are derived through the highly non-linear algorithm for Fitnesses and Complexities.

Nestedness The concept of nestedness relates to how much a row (or a column) is a subset of the
others. In this sense, it is a way to evaluate the “triangularity” of the binary matrix M. Among
the different definitions of nestedness, we opted for the NODF (Nestedness measure
based on Overlap and Decreasing Fill), presented in [32]. This choice is motivated by the
fact that it is independent of the order of the elements and it is particularly intuitive. The final
value of nestedness is the sum of the contributions of the columns (i.e. products) and of the rows
(i.e. countries), weighted by the number of possible pairs of elements (for details about the
definition of the NODF, see the Appendix); for completeness, we analysed the contributions from
countries and products separately, i.e. $\mathrm{NODF}_c$ and $\mathrm{NODF}_p$ respectively, as well
as the total contribution, $\mathrm{NODF}_t$.

As can be seen in Fig. 4, panel c, the $\mathrm{NODF}_p$ of real data is well replicated by our
algorithm. This means that our model is able to catch the main features of the hierarchical
organization of products; this result is due to the assumption of the presence of a products network,
i.e. a hierarchical structure among products. The result for $\mathrm{NODF}_c$, in panel b of Fig. 4,
is much less trivial: in this case too the values from the simulations reproduce the same quantity for
the real matrix, even though no explicit structure was imposed on the set of countries.


As a matter of fact, the productivity paths followed on the products network impose a
nested structure on countries, as a consequence of i) the hierarchical structure of products and
ii) the mechanism of the products network evolution. Since poorly diversified countries cover a
subset of the products network covered by the most diversified ones, the matrix M appears nested with
respect to rows too.
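For concreteness, a compact sketch of a NODF computation is given here. It assumes the usual paired-overlap recipe (for every pair of rows with different fill, the fraction of the entries of the less filled row that are shared with the more filled one; pairs with equal fill contribute zero) and reports values on a [0, 1] scale rather than as percentages. The function names are hypothetical, and the definition given in the Appendix may differ in normalization.

```python
from itertools import combinations


def _paired_overlap(vectors):
    """Sum of paired overlaps over all pairs of binary vectors."""
    total = 0.0
    for a, b in combinations(vectors, 2):
        k_a, k_b = sum(a), sum(b)
        if k_a == k_b or min(k_a, k_b) == 0:
            continue  # equal fill (or an empty vector) contributes zero
        shared = sum(x * y for x, y in zip(a, b))
        total += shared / min(k_a, k_b)
    return total


def nodf(M):
    """Return (NODF_c, NODF_p, NODF_t) for a binary matrix M[c][p].

    Assumes at least two rows (countries) and two columns (products).
    """
    rows = [list(r) for r in M]
    cols = [list(c) for c in zip(*M)]
    pairs_c = len(rows) * (len(rows) - 1) / 2
    pairs_p = len(cols) * (len(cols) - 1) / 2
    s_c = _paired_overlap(rows)   # contribution of countries
    s_p = _paired_overlap(cols)   # contribution of products
    return s_c / pairs_c, s_p / pairs_p, (s_c + s_p) / (pairs_c + pairs_p)
```

A perfectly triangular matrix gives the maximum value for all three quantities, while a diagonal matrix gives zero.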

Assortativity As proposed in [33], assortativity is a measure of how much nodes link to nodes with
similar degree (more details can be found in the Appendix). Since the poorly diversified countries
focus their exports on the most ubiquitous products, the export bipartite network is
disassortative, i.e. a network in which low degree nodes link to high degree ones. Our model provides a
value inside the 95% probability band but, with respect to the other measurements, the assortativity
shows a worse agreement between the simulations and real data, as can be seen in Fig. 4d).
This behaviour is probably due to a sort of “second order” effect: the most diversified countries do
export even the most ubiquitous products, but their export basket is nevertheless biased towards
the highest quality products. Indeed, the distribution in the figure shows that the model provides
a less disassortative bipartite network with respect to the real one (in red in the figure).
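As an illustration, the degree correlation can be estimated with the Pearson coefficient between the degrees found at the two ends of every link. This is one common reading of the assortativity of [33], assumed here because the definition used in the Appendix is not reproduced in this section; the function name is hypothetical.

```python
import math


def degree_assortativity(M):
    """Pearson correlation between k_c and k_p over the links of M[c][p]."""
    k_c = [sum(row) for row in M]
    k_p = [sum(col) for col in zip(*M)]
    xs, ys = [], []
    for c, row in enumerate(M):
        for p, m in enumerate(row):
            if m:  # one sample per link (c, p)
                xs.append(k_c[c])
                ys.append(k_p[p])
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when all degrees coincide
    return cov / math.sqrt(var_x * var_y)
```

A negative value signals a disassortative network: on a triangular matrix, the least diversified country is linked only to the most ubiquitous product, which pushes the coefficient below zero.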

Motifs. Let us represent an entry of the M-matrix as a square which is empty (□) if $m_{cp} = 0$, while
it is filled (■) if $m_{cp} = 1$. The checkerboard score [34], i.e. the number of 2x2 patterns with
rows (■ □), (□ ■) or (□ ■), (■ □) inside the M-matrix, counts the mutually exclusive exported products
for pairs of different countries. As shown in Fig. 5a), the total number of checkerboards is finely
reproduced by the model; the agreement with real data means that the evolutionary algorithm replicates
well the diversity of the technological development roads followed by countries.

In [35, 36] several motifs for bipartite networks have been proposed in order to uncover the structural
properties of the system at hand; in the following we consider just the simplest ones, V-motifs
and Λ-motifs. In a few words, they represent the number of co-occurrences of products in the
baskets of two different countries (V-motifs) or the number of co-occurrences of countries in the sets
of producers of two different products (Λ-motifs); more details on the definitions of the motifs can
be found in the Appendix.

In most cases the real value of the number of Λ-motifs falls at the border of the first
3-quantile and well inside the area between the 25th and the 975th permilles (that is, the
area containing 95% of the probability around the mean value), see Fig. 5c).

We did not assume any kind of structure on the set of countries, and this is probably why the total
number of V-motifs is not reproduced, see Fig. 5b). Indeed, the value for real data always falls
outside the area containing 95% of the probability around the median of the simulated data: apparently
the evolutionary paths described by the products network are not enough to reproduce V-motifs.
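The three counts discussed above can be sketched as follows, under the verbal definitions given in the text: a V-motif is a product shared by a pair of countries, a Λ-motif is a country shared by a pair of products, and a checkerboard is a pair of countries together with a pair of products that they export in a mutually exclusive way. The function names are hypothetical, and the Appendix may use a different normalization.

```python
from itertools import combinations


def v_motifs(M):
    """Total co-occurrences of products in the baskets of country pairs."""
    ubiquities = [sum(col) for col in zip(*M)]
    return sum(k * (k - 1) // 2 for k in ubiquities)


def lambda_motifs(M):
    """Total co-occurrences of countries among the producers of product pairs."""
    diversifications = [sum(row) for row in M]
    return sum(k * (k - 1) // 2 for k in diversifications)


def checkerboards(M):
    """Number of 2x2 submatrices with a checkerboard pattern."""
    total = 0
    for r1, r2 in combinations(M, 2):
        only_1 = sum(1 for a, b in zip(r1, r2) if a and not b)
        only_2 = sum(1 for a, b in zip(r1, r2) if b and not a)
        # Each product exclusive to the first country pairs with each
        # product exclusive to the second one.
        total += only_1 * only_2
    return total
```

In a perfectly nested matrix there are no checkerboards, since the basket of the less diversified country of every pair is contained in that of the more diversified one.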

The first three measurements, i.e. (i) the fitness and complexity plotted against the node degrees,
(ii) the nestedness and (iii) the assortativity, make explicit the hidden information encoded in the
triangular shape of the matrix M, while the motifs carry part of the topological information of the
bipartite network. Our evolutionary model is able to replicate those measures, showing that the
products network created by our algorithm can be a good starting point to better understand the hidden
forces which produce the main characteristics of the structure of the export bipartite network.


The evolution 

The total probability of introducing an innovation can be obtained by assembling Eqs. (3), (4) and (5):

\[
P(p^*) = \sum_{c,p} P_1(c) \, P_2(p|c) \, P_3(p^*|c,p). \qquad (6)
\]
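The sum in Eq. (6) can be evaluated directly on a given configuration. The sketch below assumes that baskets are stored as a dictionary of sets of product ids, that the forest is stored as an adjacency dictionary containing every product as a key, and that at least one basket is non-empty; the name `innovation_probability` is hypothetical.

```python
def innovation_probability(baskets, neighbours, alpha, beta, gamma, k0):
    """Total probability that the next step introduces a new product."""
    k_p = {p: sum(1 for b in baskets.values() if p in b) for p in neighbours}
    z1 = sum(len(b) ** alpha for b in baskets.values() if b)
    total = 0.0
    for basket in baskets.values():
        if not basket:
            continue  # empty baskets have zero weight in the first substep
        p1 = len(basket) ** alpha / z1                  # country selection
        z2 = sum(k_p[p] ** beta for p in basket)
        for p in basket:
            p2 = k_p[p] ** beta / z2                    # product selection
            w_new = k0 ** gamma                         # weight of p*
            w_copy = sum((k_p[x] + k0) ** gamma
                         for x in neighbours[p] - basket)
            if w_new + w_copy > 0:
                total += p1 * p2 * w_new / (w_new + w_copy)
    return total
```

On the initial condition, where the network has no links, nothing can be copied and the function returns 1; with a vanishing offset it returns 0, which corresponds to the saturation stage.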
The evolution of the probability of innovating with respect to the total number of products exported at
that time is reported in Fig. 6: different colors represent different values of $k^0$, while the other
parameters $(\alpha, \beta, \gamma)$ have been fixed to (1.0, 1.6, 0.4) respectively. At the early stage
of the evolution dynamics, the probability of innovating is obviously close to 1, as the number of
branchings is negligible with respect to the number of roots. This holds until the total number of
products is around 50, for all the examined values of $k^0$. After that threshold, following paths
already developed by other countries becomes more probable.

The plot in Fig. 6 shows that there are two phases: a first period of “great discoveries” (until the
total number of products is below 50), in which the topology of the early products network is shaped,
and a second period in which the technological innovations diffuse among countries in the form of
novelties. Note also that the curves for different values of $k^0$ cluster: while $k^0 = 1$ is almost
isolated, tracing a steep trajectory, the other values are close to each other.

A similar, though distinct, discussion can be made about the evolution of the probability of country
selection, given by Eq. (3). In Fig. 7 the evolution of this probability for single countries is plotted
(differently from Fig. 6, the evolution time is plotted on the horizontal axis). This plot clearly shows
the effect of the late saturation time interval (when, as explained above, $k^0 = 0$) on the selection
probability of a single country: from an initial, almost uniform diversification, a few countries become
more and more diversified, increasing in this way their probability of being selected at later times
and, consequently, reducing the same probability for the poorly diversified ones, due to the
normalization over all countries. In the saturation time interval, represented by the cyan area in
Fig. 7, we note a shrinking of the selection probability distribution over countries, due to the
prevention of further innovation. Since only novelties are permitted at this final stage,
mid-diversified countries improve their chances of diversifying, while the already developed ones are
restrained by the fact that they are already extremely diversified and novelties occur less frequently.

The same effect can be observed directly on the distribution of diversification over countries; Fig. 8
shows that there is an abrupt rise in diversification during the evolution for a few countries, while
others experience a slower evolution (again, the horizontal axis reports the time steps of the
simulation). Moreover, the most diversified countries, i.e. those which feel most strongly the decrease
in the probability of being selected due to the saturation, show an S-shaped profile of the evolution
curve, Fig. 8b): after a steeply rising slope, the saturation time determines a slow increase towards a
limiting value. Figure 8c) illustrates that not all countries show such an S-shaped behaviour due to
the presence of the saturation stage, but just those with high diversification; the others do not
occupy enough of the products network to feel the difference between the growth regime of the products
network and the saturation regime. The presence of such an S-shaped curve is also typical of
evolutionary models in biology and socio-economics, when resources are limited.


Conclusion 

The main target of the Economic Complexity approach is to unveil,
through the information contained in the binary bipartite network of countries and exported products,
the productive capabilities of different countries and the industrial hierarchical space. Quite
surprisingly with respect to some celebrated economic theories, the biadjacency matrix M defining the
bipartite network exhibits a peculiar, approximately triangular shape. This shows that the most
diversified countries, i.e. those able to export a wide class of different products, are the only ones
able to export both the most technologically advanced goods and the simplest ones. On the contrary,
poorly diversified countries usually export only ubiquitous products, which in general carry a low level
of industrial complexity. This new approach has led to many interesting results about the
competitiveness of countries and the complexity of products. In particular, the construction of a
products network, defining a hierarchy among products, makes it possible to determine the different
paths followed by countries in the product space to develop their export baskets.

In this framework we proposed a simple dynamic evolution algorithm that, starting from general initial
conditions, is able to reproduce the main features of the observed countries/products bipartite network,
such as different measures of the “triangular” shape of the observed M-matrix, in a wide range of the
parameters around reasonable values. The central ingredient of the evolutionary model is the progressive
and self-consistent construction of the products network, encoding the different steps of technological
progress.

Our model provides the simultaneous evolution of the matrix M and the products network; the dynamical
evolution of the latter at the same time drives and is driven by the evolution of the matrix M, as
the technological evolution depends tightly on the productive capabilities of the different countries, i.e.
on the nodes each country occupies in the products network.

The proposed model is able to reproduce the main features of the observed bipartite network for a wide
range of parameters. In particular we compared the “shape” of the matrix M, as encoded in the
Fitness/Complexity algorithm [14, 15, 16, 17, 19]; the measures of the nestedness (in the definition proposed
in [32]) and of the assortativity [33]; and some motifs, like the checkerboard patterns [34] and the V- and
Λ-motifs [35, 36].

For the range of parameters examined we find the observed values of all quantities, with a single minor
exception (the V-motifs), inside the 95% interval of the distribution of simulated data.

Our model is meant to be a first step in the direction of a dynamical network approach to the processes
of countries' innovation and competition on exports. There are several possible directions of improvement,
such as implementing different evolutionary rules for the construction of the products network and for the
modelling of the countries' dynamics on it.

Another possible direction could be the introduction of an appropriate random process of losing products
from the export basket, simulating exogenous phenomena such as the progressive obsolescence of “old”
products or the presence of socio-political factors (such as wars, traditions, political resolutions, etc.). All
these (and other) possible approaches are going to be studied in following works.

Appendix


Fitness and Complexity 

In order to extract the information contained in the matrix M, and to overcome flaws present in
earlier seminal works, a metric for countries and products has been proposed, the celebrated
Fitness and Complexity algorithm [14, 15, 16, 17, 19]: this recursive and non-linear algorithm is a sort of
PageRank applied to bipartite networks, where Fitness is the quantity for countries, while Complexity is
the one for products. The idea at the basis of the algorithm is that the highest fitness countries are those
which are able to export the highest number of the most exclusive products, i.e. those with the highest
complexity. In particular, the Fitness $F_c$ of the generic country $c$ and the Complexity (or Quality) $Q_p$
of the generic product $p$ at the $n$-th step of iteration are defined as

$$
\begin{cases}
\tilde{F}_c^{(n)} = \sum_p m_{cp}\, Q_p^{(n-1)} \\[6pt]
\tilde{Q}_p^{(n)} = \dfrac{1}{\sum_c m_{cp}\, \dfrac{1}{F_c^{(n-1)}}}
\end{cases}
\qquad \rightarrow \qquad
\begin{cases}
F_c^{(n)} = \dfrac{\tilde{F}_c^{(n)}}{\langle \tilde{F}_c^{(n)} \rangle_c} \\[6pt]
Q_p^{(n)} = \dfrac{\tilde{Q}_p^{(n)}}{\langle \tilde{Q}_p^{(n)} \rangle_p}
\end{cases}
\tag{7}
$$

where the symbols $\langle \cdot \rangle$ indicate the average taken over the proper set. The initial conditions
are taken as $F_c^{(0)} = Q_p^{(0)} = 1$ for all $c = 1, \dots, N_c$ and $p = 1, \dots, N_p$, where $N_c$ and $N_p$
are the numbers of countries and products, respectively (the convergence of the algorithm described by
Eqs. (7) depends on the shape of the matrix M, as has been discussed elsewhere).
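
The iteration of Eqs. (7) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `fitness_complexity`, the fixed number of iterations `n_iter` and the absence of a convergence test are assumptions made here, and every country and product is assumed to have at least one non-zero entry in M.

```python
import numpy as np

def fitness_complexity(m, n_iter=50):
    """Iterate Eqs. (7) on a binary countries x products matrix m.

    Assumes every row and every column of m has at least one non-zero entry.
    n_iter is a hypothetical stopping rule (fixed number of iterations).
    """
    m = np.asarray(m, dtype=float)
    n_c, n_p = m.shape
    fitness = np.ones(n_c)   # F_c^(0) = 1
    quality = np.ones(n_p)   # Q_p^(0) = 1
    for _ in range(n_iter):
        # Intermediate (tilde) quantities, both built from step n-1.
        f_tilde = m @ quality                      # sum_p m_cp Q_p^(n-1)
        q_tilde = 1.0 / (m.T @ (1.0 / fitness))    # 1 / sum_c m_cp / F_c^(n-1)
        # Normalise by the average over countries and over products.
        fitness = f_tilde / f_tilde.mean()
        quality = q_tilde / q_tilde.mean()
    return fitness, quality

# Toy perfectly triangular matrix: country 0 exports everything.
toy = np.array([[1, 1, 1],
                [1, 1, 0],
                [1, 0, 0]])
fitness, quality = fitness_complexity(toy, n_iter=20)
```

On such a toy matrix the most diversified country gets the highest fitness, and the product exported only by that country gets the highest complexity, which is the qualitative behaviour described above.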


Non trivial benchmarks 

Fitness and Quality distributions 

Using Fitness and Complexity it is possible to reveal several non-trivial properties of the matrix M: the
very first observation is that, once rows and columns are reordered by fitness and complexity respectively,
the matrix M shows a peculiar triangular form, already observed in biological systems [22, 39].
The triangular shape of M shows that the most diversified countries do not export just the most
exclusive products, but also the common ones.

The form of the distribution of the Fitness (Quality) ranking against country diversification (product
ubiquity) depends strongly on the shape of M. These are sparse distributions, due to the non-linearity
of the Fitness/Complexity algorithm, which also causes their peculiar shape, as shown in Fig. 3(a, b):
real data are blue points, while the cloud represents the frequency of the simulated data.
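
As a small illustration of the reordering mentioned above, the sketch below sorts the rows of a binary matrix by decreasing country score and the columns by increasing product score. The helper name `reorder_by_scores` and the sorting directions are assumptions chosen here so that the filled corner ends up at the top left; the scores can be Fitness and Complexity, or simply diversification and ubiquity.

```python
import numpy as np

def reorder_by_scores(m, row_scores, col_scores):
    """Return m with rows sorted by decreasing row_scores and
    columns sorted by increasing col_scores (hypothetical convention)."""
    m = np.asarray(m)
    row_order = np.argsort(-np.asarray(row_scores, dtype=float), kind="stable")
    col_order = np.argsort(np.asarray(col_scores, dtype=float), kind="stable")
    return m[np.ix_(row_order, col_order)]

# Toy example: shuffled rows of a nested matrix.
shuffled = np.array([[1, 0, 0],
                     [1, 1, 1],
                     [1, 1, 0]])
diversification = shuffled.sum(axis=1)   # k_c, number of products per country
ubiquity = shuffled.sum(axis=0)          # k_p, number of exporters per product
# Using diversification as country score and minus ubiquity as product score.
ordered = reorder_by_scores(shuffled, diversification, -ubiquity)
```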

Nestedness 

As already mentioned, the “triangularity” of the matrix M is a typical feature of biological mutualistic
networks. Traditionally, in the biology literature the triangular form of the biadjacency matrix is
measured by the nestedness, i.e. how much a row (or a column) is a subset of the others. The literature
offers a great number of different definitions of nestedness [22, 38, 42]; here
we use the NODF (Nestedness metric based on Overlap and Decreasing Fill) presented in [32]
because, in our opinion, it is the most intuitive. Using the definitions

$$
T^{r}_{cc'} \overset{\mathrm{def}}{=}
\begin{cases}
0 & \text{if } k_c = k_{c'} \\[4pt]
\dfrac{\sum_p m_{cp}\, m_{c'p}}{\min\{k_c, k_{c'}\}} & \text{otherwise}
\end{cases}
\qquad
T^{c}_{pp'} \overset{\mathrm{def}}{=}
\begin{cases}
0 & \text{if } k_p = k_{p'} \\[4pt]
\dfrac{\sum_c m_{cp}\, m_{cp'}}{\min\{k_p, k_{p'}\}} & \text{otherwise}
\end{cases}
$$

the total nestedness measure $\mathrm{NODF}_t$ is then

$$
\mathrm{NODF}_t \overset{\mathrm{def}}{=} 2\,
\frac{\sum_{c<c'} T^{r}_{cc'} + \sum_{p<p'} T^{c}_{pp'}}{N_c (N_c - 1) + N_p (N_p - 1)},
\tag{8}
$$


where $N_c$ and $N_p$ are the numbers of countries and products, respectively. Note that a pair of rows
(or columns) contributes to the nestedness only if the two have a different number of non-zero
elements (hence the “decreasing fill” in the NODF acronym). The final formula for the nestedness is
the weighted sum of the contributions from rows and from columns, so it is possible to isolate the
nestedness of rows (i.e. countries, $\mathrm{NODF}_c$) and of columns (i.e. products, $\mathrm{NODF}_p$):

$$
\mathrm{NODF}_c = 2\, \frac{\sum_{c<c'} T^{r}_{cc'}}{N_c (N_c - 1)},
\qquad
\mathrm{NODF}_p = 2\, \frac{\sum_{p<p'} T^{c}_{pp'}}{N_p (N_p - 1)}.
\tag{9}
$$


Since usual matrices M are quite “rectangular”, i.e. the number of products is much greater than the
number of countries, the total $\mathrm{NODF}_t$ is biased by the contribution from the products. In fact,
combining Eqs. (8) and (9):

$$
\mathrm{NODF}_t =
\frac{N_c (N_c - 1)\, \mathrm{NODF}_c + N_p (N_p - 1)\, \mathrm{NODF}_p}{N_c (N_c - 1) + N_p (N_p - 1)}
\;\overset{N_p \gg N_c}{\simeq}\; \mathrm{NODF}_p.
\tag{10}
$$


Because of the previous relation, for the analysis of the initial conditions we just compared the results for
$\mathrm{NODF}_c$ and $\mathrm{NODF}_p$, while the effect of the previous approximation is shown in the
comparison between Table 1 and Table 3.
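
The three NODF quantities of Eqs. (8)-(10) can be computed as in the following sketch. It is a minimal NumPy illustration under the definitions given above (pairs with equal degree contribute zero); the function names `nodf` and `_pair_sum` are hypothetical, and at least two countries and two products are assumed.

```python
import numpy as np

def _pair_sum(rows):
    """Sum over unordered pairs of rows of overlap / min(degree),
    with zero contribution when the two degrees coincide."""
    degree = rows.sum(axis=1)
    overlap = rows @ rows.T                         # shared non-zero entries
    min_degree = np.minimum.outer(degree, degree)
    with np.errstate(divide="ignore", invalid="ignore"):
        t = np.where(min_degree > 0, overlap / min_degree, 0.0)
    t[degree[:, None] == degree[None, :]] = 0.0     # "decreasing fill" condition
    return np.triu(t, k=1).sum()                    # keep each pair once

def nodf(m):
    """Return (NODF_t, NODF_c, NODF_p) for a binary countries x products matrix."""
    m = np.asarray(m, dtype=float)
    n_c, n_p = m.shape
    sum_rows = _pair_sum(m)      # sum_{c<c'} T^r_{cc'}
    sum_cols = _pair_sum(m.T)    # sum_{p<p'} T^c_{pp'}
    w_c = n_c * (n_c - 1)
    w_p = n_p * (n_p - 1)
    nodf_c = 2 * sum_rows / w_c
    nodf_p = 2 * sum_cols / w_p
    nodf_t = 2 * (sum_rows + sum_cols) / (w_c + w_p)
    return nodf_t, nodf_c, nodf_p
```

A perfectly triangular matrix gives the maximum value 1 for all three quantities, while a matrix whose rows all have the same degree gives 0 for the row term. Since the weights are $N_c(N_c-1)$ and $N_p(N_p-1)$, a matrix with many more products than countries yields a total value close to the product term, as in Eq. (10).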


Assortativity 

The assortativity parameter r [33] has been introduced in order to measure how much nodes tend to be
linked to nodes with a similar degree. In more detail, r takes values from -1 to 1, where -1 denotes a
perfectly disassortative network, i.e. the lowest degree nodes are linked to the highest degree ones, while 1
denotes a perfectly assortative network, i.e. the highest degree vertices are linked to the other highest
degree ones; in terms of the adjacency matrix and node degrees, r can be written as

$$
r \overset{\mathrm{def}}{=}
\frac{2M \left( 2 \sum_{c,p} k_c\, m_{cp}\, k_p \right) - \left( \sum_c k_c^2 + \sum_p k_p^2 \right)^2}
     {2M \left( \sum_c k_c^3 + \sum_p k_p^3 \right) - \left( \sum_c k_c^2 + \sum_p k_p^2 \right)^2},
$$

where, in this formula only, $M = \sum_{c,p} m_{cp}$ denotes the total number of links of the bipartite network.

From the previous discussion about the triangularity, one may expect the value of the assortativity
to be negative (since poorly diversified countries export the most ubiquitous products), but with a low
absolute value, say much less than 0.5, since high degree countries have low complexity products in their
export basket, as well as more complex ones. This expectation is just partly satisfied, since the value of
r for the observed matrices is indeed negative, but its absolute value is quite large, say of the order of
0.6: at a second look at Fig. 3(c), in effect, it is possible to see that the density of the export basket
of the most diversified countries moves toward the low degree products, i.e. the most exclusive ones.
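
The degree correlation above can be evaluated directly from the biadjacency matrix. The sketch below assumes the standard degree-assortativity definition of [33] written in terms of the country degrees $k_c$, the product degrees $k_p$ and the number of links; the function name `bipartite_assortativity` is hypothetical, and the result is undefined (division by zero) when all nodes have the same degree.

```python
import numpy as np

def bipartite_assortativity(m):
    """Degree assortativity r of a bipartite network from its
    binary biadjacency matrix m (countries x products)."""
    m = np.asarray(m, dtype=float)
    k_c = m.sum(axis=1)                 # country degrees (diversification)
    k_p = m.sum(axis=0)                 # product degrees (ubiquity)
    n_links = m.sum()                   # total number of links
    s1 = k_c @ m @ k_p                  # sum over links of k_c * k_p
    s2 = (k_c ** 2).sum() + (k_p ** 2).sum()
    s3 = (k_c ** 3).sum() + (k_p ** 3).sum()
    return (4 * n_links * s1 - s2 ** 2) / (2 * n_links * s3 - s2 ** 2)

# A perfectly triangular 3 x 3 matrix is disassortative.
tri = np.array([[1, 1, 1],
                [1, 1, 0],
                [1, 0, 0]])
r_tri = bipartite_assortativity(tri)
```

For this toy matrix the sums are 31, 28 and 72 with 6 links, so the ratio is (744 - 784) / (864 - 784) = -0.5, a negative value as expected for a triangular shape.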


Motifs for bipartite networks 


Some motifs for bipartite networks have been defined in the context of biological mutualistic networks.
One of the most used is the checkerboard score [34], i.e. the number of 2 x 2 submatrices of the
biadjacency matrix showing mutually exclusive patterns such as
$\begin{smallmatrix} \blacksquare & \square \\ \square & \blacksquare \end{smallmatrix}$ and
$\begin{smallmatrix} \square & \blacksquare \\ \blacksquare & \square \end{smallmatrix}$^1. The checkerboard
score, in other words, measures how mutually exclusive the choices made by different
countries about the composition of their export basket are. The total number of possible checkerboard
patterns^2 can be written as

$$
N_{\mathrm{checkerboards}} = \frac{1}{2} \sum_{c,c'} \sum_{p,p'}
m_{cp}\, (1 - m_{cp'})\, (1 - m_{c'p})\, m_{c'p'}.
$$


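1 $\begin{smallmatrix} \blacksquare & \square \\ \square & \blacksquare \end{smallmatrix}$ and
$\begin{smallmatrix} \square & \blacksquare \\ \blacksquare & \square \end{smallmatrix}$ are meant to be
2 x 2 patterns in the matrix M, where $\blacksquare$ and $\square$ represent respectively $m_{cp} = 1$ and
$m_{cp} = 0$.

2 Since we prefer absolute measures, i.e. measures not depending on the order of rows and columns, in the
total number of checkerboard patterns we consider the occurrences of both
$\begin{smallmatrix} \blacksquare & \square \\ \square & \blacksquare \end{smallmatrix}$ and
$\begin{smallmatrix} \square & \blacksquare \\ \blacksquare & \square \end{smallmatrix}$.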
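
A direct way to count checkerboard patterns is sketched below. It is only an illustration under the convention of footnote 2, i.e. each 2 x 2 checkerboard submatrix is counted once, whichever of the two orientations it shows; the function name `count_checkerboards` is hypothetical. For a pair of countries, the number of checkerboards equals the number of products exported only by the first times the number exported only by the second.

```python
import numpy as np

def count_checkerboards(m):
    """Count 2 x 2 checkerboard submatrices (both orientations) in a
    binary countries x products matrix m."""
    m = np.asarray(m, dtype=np.int64)
    n_c = m.shape[0]
    total = 0
    for c in range(n_c):
        for c2 in range(c + 1, n_c):
            # Products exported by c but not by c2, and vice versa.
            only_c = np.sum((m[c] == 1) & (m[c2] == 0))
            only_c2 = np.sum((m[c] == 0) & (m[c2] == 1))
            total += int(only_c * only_c2)
    return total
```

A perfectly nested matrix has no checkerboards, because for any two countries one export basket contains the other.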
Several other motifs for bipartite networks have been proposed in [35] in order to uncover the structural
properties of the system at hand. Among others, we decided to focus on V- and Λ-motifs^3. Respectively,
the total numbers of V-motifs and Λ-motifs count the total co-occurrences of products in two different
export baskets and of countries in the sets of producers of two different goods; in terms of the entries of
the biadjacency matrix M, they are defined as

$$
N_V \overset{\mathrm{def}}{=} \sum_{c<c'} \sum_p m_{cp}\, m_{c'p},
\qquad
N_\Lambda \overset{\mathrm{def}}{=} \sum_{p<p'} \sum_c m_{cp}\, m_{cp'}.
$$
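
Both motif counts reduce to sums of binomial coefficients over the degrees: a product exported by $k_p$ countries is shared by $k_p (k_p - 1)/2$ pairs of countries, and a country exporting $k_c$ products holds $k_c (k_c - 1)/2$ pairs of products. The sketch below uses this identity; the function name `count_v_lambda_motifs` is an assumption made for illustration.

```python
import numpy as np

def count_v_lambda_motifs(m):
    """Return (N_V, N_Lambda) for a binary countries x products matrix m."""
    m = np.asarray(m, dtype=np.int64)
    k_c = m.sum(axis=1)   # diversification of each country
    k_p = m.sum(axis=0)   # ubiquity of each product
    # V-motifs: pairs of countries sharing one product.
    n_v = int((k_p * (k_p - 1) // 2).sum())
    # Lambda-motifs: pairs of products in the same export basket.
    n_lambda = int((k_c * (k_c - 1) // 2).sum())
    return n_v, n_lambda
```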


Tuning parameters 

The model, described in detail in section “The model”, has a four-dimensional parameter space. In order
to determine which values of $(\alpha, \beta, \gamma, k^0)$ are most compatible with observations, we generated
matrices for all possible values of the parameters and compared the results with the observed data. In
particular, we focus on initial conditions with $N_{roots} < 25$, according to the conclusions of the following
section. In Tables 1-7 simulated data have been generated for initial conditions with
$N_{roots} = 20$ and $P_0 = 0.3$.

We observed that the parameter $k^0$ appears not to influence the measures for $k^0 > 4$. It is worth
noticing that 4 is also the average value of the minimum of the ubiquity for real matrices; effectively,
$k_p = 4$ is the first value of the ubiquity for which the probability of a novelty exceeds the probability of
innovating. In the following we present results with $k^0 = 4$; for every $(\alpha, \beta, \gamma)$ configuration we
generated 56 matrices.

As Tables 1, 2, 3, 4, 5 and 7 show, once $\beta$ and $\gamma$ (i.e. the parameters of the second and the third
choices) are fixed, the values of the parameter $\alpha$ that are able to replicate real data lie in a quite
narrow range around $\alpha = 1.6$. Instead, for a fixed value of $\alpha$ the best results are in an area around
the “anti-diagonal” of the tables shown, i.e. for decreasing $\beta$ values as $\gamma$ grows and vice versa.
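
The comparison between an observed value and the distribution of simulated values can be sketched as follows. The two bands used here are the ones quoted in the figure legends, i.e. the area between the 25th and 975th permilles and the area between the first and second 3-quantiles; the function name `acceptance` and the returned dictionary are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def acceptance(simulated, observed):
    """Check whether an observed measure lies inside the 95% band
    (25th to 975th permilles) and inside the central third
    (first to second 3-quantile) of the simulated values."""
    simulated = np.asarray(simulated, dtype=float)
    lo95, hi95 = np.quantile(simulated, [0.025, 0.975])
    lo33, hi33 = np.quantile(simulated, [1.0 / 3.0, 2.0 / 3.0])
    return {
        "within_95": bool(lo95 <= observed <= hi95),
        "within_central_third": bool(lo33 <= observed <= hi33),
    }
```

Applying such a check to every measure and every $(\alpha, \beta, \gamma)$ configuration gives the kind of acceptance map discussed for the tables.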

The results for the number of V-motifs in Table 6 deserve a special treatment: in fact, while the
Λ-motifs are well reproduced, the model is only rarely able to replicate the observed number of
V-motifs. The meaning of this result resides in the definition of V- and Λ-motifs: Vs (Λs) count the
co-occurrences of countries (products) in the set of producers of every product (in the export basket of
every country). In effect, our algorithm is driven by a hierarchy imposed on the products, while no kind
of structure is forced onto the set of countries, so the constraints drive the total number of Λs, while the
number of Vs is too “free”.

Tuning Initial Conditions 

In Table 8 we compare different values of $N_{roots}$, from 10 to 40 (in steps of 5 roots), and different
values of $P_0$, from 0.2 to 0.6 (in steps of 0.05); we fix the values of the parameters of the evolution
dynamics among the best performing ones, according to the previous analyses, i.e. $\alpha = 1.6$, $\beta = 1$,
$\gamma = 0.6$ and $k^0 = 4$; for every configuration of initial conditions we produced 50 simulated matrices M.
The results can be observed in Fig. 8: for low values of $N_{roots}$, the discrepancy among the distributions
relative to different values of $P_0$ is limited, and they very often cross the red line representing real data.
Because of this, we mostly used low $N_{roots} < 25$, since they need less fine tuning of $P_0$. Moreover,
higher $P_0$ are less precise, especially for a higher number of roots: thus it seems that the mean number
of roots per country needed to make the algorithm start should be quite small, say $\lesssim 6$.
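
A possible way to draw such initial conditions is sketched below, assuming that each of the $N_{roots}$ root products is assigned to each country independently with probability $P_0$ (this independence, the function name `initial_matrix` and the `seed` argument are assumptions of this sketch). Under this assumption the mean number of roots per country is $N_{roots} P_0$, e.g. $20 \times 0.3 = 6$ for the values used in the tables.

```python
import numpy as np

def initial_matrix(n_countries, n_roots, p0, seed=None):
    """Draw a binary countries x roots matrix where every entry is 1
    with probability p0 (independent draws, an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    return (rng.random((n_countries, n_roots)) < p0).astype(int)

# Example with the initial conditions used for the tables.
m0 = initial_matrix(n_countries=100, n_roots=20, p0=0.3, seed=0)
mean_roots_per_country = m0.sum(axis=1).mean()   # expected to be close to 6
```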

We also tried, imposing some offsets for every choice, to make the algorithm start from no products, but
the results are not satisfying, since the usual measures on the matrix are not replicated.


3 The names V (Λ) come from the fact that, once the two layers of the bipartite network are rotated such that
the countries layer is over the products one, the former (the latter) motifs look like “V”s (“Λ”s) between the layers.

Figure & Table Legends

Figure 1: Model Evolution. We propose a pictorial example of one iteration of the evolution of our
model: the bipartite network of countries and exported products is on the right, while the products
network is on the left. Coloured (red, blue and green) disks represent countries, while products are circles
numbered from 1 to 20; in the bipartite network the links are coloured as the country they refer to.
For completeness, we projected the information from the bipartite network onto the products network,
superimposing a coloured disk on a given node if the country is able to export that product. In the
top panel of Fig. 1 we present the initial condition, in which the red, blue and green countries have their
export products and occupy different nodes in the products network. In the second panel of Fig. 1 the
first substep of the algorithm is taken: following the recipe described in the section “The method”, the red
country is selected. The products in the “red” export basket are highlighted both on the bipartite network
(on the right) and on the products network (on the left). At the second substep, among the products in
the “red” export basket, “4” is selected in the third panel of Fig. 1. Together with the previous step, we
have selected a link in the bipartite network. On the left are the possible choices on the products network
for the third substep: since “10” is already in the “red” export basket, it cannot be selected as the final
target for the evolution, so the only possibilities left are product “8” (already produced by “green”) or
“20”, a brand new product. In the bottom panel of Fig. 1 the third choice has been taken and “20” is
a new product in the products network.

Figure 2: The density evolution for the dataset of [25, 26]. It is possible to observe the density increasing up to a
certain value $\rho_{cr} \simeq 0.13$ around the year 1975; our model follows a similar behaviour, limiting the evolution
to the export of existing products once the number of nodes in the products network reaches the number of
observed products in the real network.

Figure 3: In Fig. 3(a) the scatter plot of the Fitness ranking against countries diversification, while in Fig. 3(b)
the one for the Quality ranking against products ubiquity; the blue points represent the observed values (for the
year 1980 from the dataset of [25, 26]). The black line represents the average value over the simulations, while the
grey lines bound the area between the second and the first 3-quantiles (dot-dashed) and between the 975th and
25th permilles (dashed). The data are obtained for initial conditions $N_{roots} = 20$ and $P_0 = 0.3$ and parameters
$\alpha = 1.55$, $\beta = 0.8$, $\gamma = 0.3$, $k^0 = 4$. About 82% of the observed data fall into the area between the 975th
and 25th permilles for the fitness distribution, and about 75% for the quality distribution. In Fig. 3(c) the original
matrix for 1980 from the dataset of [25, 26]; in Fig. 3(d) one of the synthetic matrices for initial conditions
$N_{roots} = 20$ and $P_0 = 0.3$ and parameters $\alpha = 1.65$, $\beta = 1.1$, $\gamma = 0.6$, $k^0 = 4$.

Figure 4: The distributions of the nestedness values (obtained employing NODF, the definition by [32]) and of the
assortativity index r (obtained employing the definition by [33]) for 50 simulations with initial conditions
$N_{roots} = 20$ and $P_0 = 0.3$ and parameters $\alpha = 1.55$, $\beta = 0.8$, $\gamma = 0.3$, $k^0 = 4$. In Fig. 4(a) the total
NODF, in Fig. 4(b) the NODF for rows and in Fig. 4(c) the one for columns. The red line is the observed value for
the year 1980 from the dataset of [25, 26], the blue dashed lines bound the area between the second and the first
3-quantiles, while the dark blue dots mark the area between the 975th and 25th permilles. For the 4 distributions,
real values easily fit in the 95%; however, for the NODF values the real values lie just outside the central third of
the probability. Notice the similar distributions for $\mathrm{NODF}_t$ and $\mathrm{NODF}_p$, as explained in Eq. (10).
In Fig. 4(d) the distribution of the assortativity values (obtained employing the definition by [33]): even if the
distribution is quite irregular, the value measured on the real matrix is just outside the area containing the 33% of
the distribution.

Figure 5: The distributions of the number of checkerboards (Fig. 5(a)), V-motifs (Fig. 5(b)) and Λ-motifs (Fig. 5(c)),
obtained from the simulations with initial conditions $N_{roots} = 20$ and $P_0 = 0.3$ and parameters $\alpha = 1.55$,
$\beta = 0.8$, $\gamma = 0.3$, $k^0 = 4$. The red line is the observed value for the year 1980 from the dataset of [25, 26],
the blue dashed lines bound the area between the second and the first 3-quantiles, while the dark blue dots mark the
area between the 975th and 25th permilles. While Fig. 5(a) and Fig. 5(c) show that the numbers of checkerboards and
Λ-motifs are reproduced by the model, in Fig. 5(b) the real value lies outside the 95% of probability; the presence of
a hierarchy in the set of products captures the right values of checkerboards and Λ-motifs, but it is not enough
to reproduce the V-motifs.

Figure 6: The evolution of the probability of innovation depending on $k^0$ (on the horizontal axis the total number
of products at the time). It is possible to observe two different phases: a first period of discoveries, when the
probability of innovating is close to 1, and a second period in which the spread of the novelties increases. It is
worth noticing that all slopes for $k^0 > 1$ cluster together.



Figure 7: The evolution of the probability of selecting every country; on the horizontal axis there is the simulation
time. Until the saturation regime (the cyan area) a few countries increase their probabilities of being selected
as their diversification increases, to the detriment of the poorly diversified countries, whose probabilities
are pushed lower. In the saturation period, the mid-diversified countries enlarge their export basket, boosting
their probabilities of being selected, while the most diversified countries are restrained; in this way the gap among
countries reduces.

Figure 8: The evolution of the diversification against the simulation time. In Fig. 8(a) all countries, except those
whose initial conditions have been particularly unlucky, evolve, boosting their diversification; in the cyan area, i.e.
during the saturation regime, highly diversified countries experience an evolution different from the others. In effect,
it is possible to observe, focusing on the highest fitness countries, Fig. 8(b), that all of them experience a
growth with an S-curve profile, which is typical of biological and economic systems with finite resources. Note that
for the other, less diversified countries, Fig. 8(c), this phenomenon is not present; effectively the main effect of the
saturation regime is to reduce the gap between the diversification of fully developed and developing countries.

Table 1: Parameters space analysis: $\mathrm{NODF}_t$. It is possible to observe the variation of $\mathrm{NODF}_t$ as the parameters $(\alpha, \beta, \gamma)$ change; the parameter
$k^0$ has been kept fixed to the value 4, since no variation in any of the measures analysed has been observed for greater values. The $\mathrm{NODF}_t$ measured on
the original matrix is represented as a red line. The best values of the parameter $\alpha$ are the lowest analysed, i.e. $\alpha < 1.65$. Instead, the wider area of
acceptance for the $\gamma$ parameter is the central area of the table, i.e. $0.4 < \gamma < 0.6$, while $\beta$ is more or less non-influential on the acceptance of the
measure.

Table 2: Parameters space analysis: $\mathrm{NODF}_p$. It is possible to observe the variation of $\mathrm{NODF}_p$ as the parameters $(\alpha, \beta, \gamma)$ change; the parameter
$k^0$ has been kept fixed to the value 4, since no variation in any of the measures analysed has been observed for greater values. The $\mathrm{NODF}_p$ measured on
the original matrix is represented as a red line. The best values of the parameter $\alpha$ for reproducing $\mathrm{NODF}_p$ are the lowest analysed, i.e. $\alpha < 1.65$. Instead,
the wider area of acceptance for the $\gamma$ parameter is the central area of the table, i.e. $0.4 < \gamma < 0.6$, while $\beta$ is more or less non-influential on
the acceptance of the measure. Note that all the graphs presented here completely overlap with those of Table 1 because for the analysed
network the approximation of Eq. (10) holds.

Table 3: Parameters space analysis: $\mathrm{NODF}_c$. It is possible to observe the variation of $\mathrm{NODF}_c$ as the parameters $(\alpha, \beta, \gamma)$ change; the parameter
$k^0$ has been kept fixed to the value 4, since no variation in any of the measures analysed has been observed for greater values. The $\mathrm{NODF}_c$ measured on
the original matrix is represented as a red line. The best values of the parameter $\alpha$ are the lowest analysed, i.e. $\alpha < 1.65$, as in Table 2. In replicating the
measure of $\mathrm{NODF}_c$ the values of $\beta$ and $\gamma$ are even less influential than in the previous tables, and more or less all configurations $(\alpha, \beta, \gamma)$ are able to correctly replicate
the real value.

Table 4: Parameters space analysis: r. It is possible to observe the variation of r as the parameters $(\alpha, \beta, \gamma)$ change; the parameter $k^0$ has been
kept fixed to the value 4, since no variation in any of the measures analysed has been observed for greater values. The r measured on the original matrix
is represented as a red line. The wider area of acceptance for the $\beta$ and $\gamma$ parameters is around the “anti-diagonal” of the table, i.e. high
values of $\gamma$ for low $\beta$ and vice versa, while $\alpha$ is always centred on 1.65.

Table 5: Parameters space analysis: $N_{\mathrm{checkerboards}}$. It is possible to observe the variation of $N_{\mathrm{checkerboards}}$ as the parameters
$(\alpha, \beta, \gamma)$ change; the parameter $k^0$ has been kept fixed to the value 4, since no variation in any of the measures analysed has been observed for greater values. The
$N_{\mathrm{checkerboards}}$ measured on the original matrix is represented as a red line. The wider area of acceptance for the $\beta$ and $\gamma$ parameters is similar to the one
of Table 4, while the best performing values of $\alpha$ are a little bit lower, say $\alpha < 1.65$.

Table 6: Parameters space analysis: $N_V$. It is possible to observe the variation of $N_V$ as the parameters $(\alpha, \beta, \gamma)$ change; the parameter $k^0$ has
been kept fixed to the value 4, since no variation in any of the measures analysed has been observed for greater values. As it is possible to see in the present
table, our model is not able to capture the number of V-motifs in the network for almost any of the parameters analysed. This is
due to the fact that the model evolution is based on a hierarchical structure for products (the products network) that is not present for the countries: in
effect, as Table 7 shows, there is much more agreement with the original data for the Λ-motifs.

Table 7: Parameters space analysis: $N_\Lambda$. It is possible to observe the variation of $N_\Lambda$ as the parameters $(\alpha, \beta, \gamma)$ change; the parameter $k^0$ has
been kept fixed to the value 4, since no variation in any of the measures analysed has been observed for greater values. With respect to Table 6 we find
a better agreement in reproducing Λ-motifs: this is probably due to basing the evolution of the model on an (evolving) structure for products, which
leaves a trace in the total number of Λ-motifs, i.e. the number of co-occurrences of 2 different products in the export baskets. As in the previous tables, the
best results are obtained for low values of $\alpha$ and in the area along the “anti-diagonal” of the presented table.

Table 8: Initial conditions analysis: it is possible to observe how the measurements vary with the initial conditions (for fixed
parameters: $\alpha = 1.6$, $\beta = 1$, $\gamma = 0.6$, $k^0 = 4$) as compared with the values from the real matrix (the red line). Every boxplot contains the distribution of 56
simulations. Initial conditions are set by assigning $N_{roots}$ initial root products, each with probability $P_0$. As it is possible to see, the best agreement
for most of the measures used is for $10 < N_{roots} < 25$ and $P_0 < 0.4$.
